Scientific Python antipatterns advent calendar day two

A short one today - as a reminder, I’ll post one tiny example per day with the intention that they should only take a couple of minutes to read.

If you want to read them all but can’t be bothered checking this website each day, sign up for the mailing list:

Sign up for the mailing list

and I’ll send a single email at the end with links to them all.

Iterating through lists the hard way

Lots of the antipatterns that I’ll be covering in this series have their roots in other programming languages. Since Python is such a flexible, expressive language, it’s often possible to take patterns from other languages and apply them in Python, even when a better option is available.

Iterating through a list is a classic example. Nearly every language has syntax for getting the element of a list at a particular index:

fruits = ['apple', 'banana', 'grapefruit', 'strawberry']

# get the element at index 1
fruits[1]
'banana'

so when it comes time to iterate over a list, a natural pattern is to use an index variable. In programming languages like C, Java and JavaScript we normally do this with a three-part for loop, something like this:

for (int i = 0; i < n; i++) {
    do_something(a[i]);
}

but that structure doesn’t exist in Python, so we have to find ways to reinvent it.

One common pattern is to make the iteration logic part of the loop body:

index = 0
while index < len(fruits):
    print(fruits[index])
    index = index + 1
apple
banana
grapefruit
strawberry

As we can see, this does work, but it involves quite a bit of code. The other option in Python is to use range to generate the sequence of indices:

for index in range(len(fruits)):
    print(fruits[index])
apple
banana
grapefruit
strawberry

This is a little easier to read, and because range is lazy it will not take up a huge amount of memory, even for a very long list.
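To make the laziness point concrete, here is a minimal sketch that compares the size of a range object with the size of the equivalent list of indices. The figure of one million is an arbitrary choice for illustration, and sys.getsizeof only measures the container itself (and behaves this way on CPython), so treat the comparison as a rough indication rather than a precise measurement.

```python
import sys

n = 1_000_000  # arbitrary size, chosen only for illustration

lazy_indices = range(n)         # works out each index on demand
eager_indices = list(range(n))  # stores every index up front

range_size = sys.getsizeof(lazy_indices)
list_size = sys.getsizeof(eager_indices)

# the range object stays small no matter how many indices it covers
print(range_size < list_size)
```

On CPython this prints True: the range object only needs to remember its start, stop and step.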

There is, however, a better and more Pythonic way to do it - just iterate over the list directly:

for fruit in fruits:
    print(fruit)
apple
banana
grapefruit
strawberry

Shorter, more readable, and much more in the Python style.
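As a quick sanity check, the three approaches above can be run side by side to confirm that they visit the same elements in the same order. This is only a sketch: the names via_while, via_range and via_direct are made up for the comparison, and the results are collected into lists rather than printed so that they can be compared.

```python
fruits = ['apple', 'banana', 'grapefruit', 'strawberry']

# 1. manual index variable with a while loop
via_while = []
index = 0
while index < len(fruits):
    via_while.append(fruits[index])
    index = index + 1

# 2. indices generated by range
via_range = []
for index in range(len(fruits)):
    via_range.append(fruits[index])

# 3. iterating over the list directly
via_direct = []
for fruit in fruits:
    via_direct.append(fruit)

# all three visit the same elements in the same order
print(via_while == via_range == via_direct)
```

The third version is the only one with no index bookkeeping, which leaves fewer places for an off-by-one mistake to hide.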

But what if you need to skip some elements?

One situation where it seems like this won’t work is where we want a subset of the elements. For example, if we want to skip the first two elements it seems like we have no option but to start using range again:

# start at 2 rather than 0
for index in range(2, len(fruits)):
    print(fruits[index])
grapefruit
strawberry

The Pythonic way to solve this is with a slice; we can make a copy of the list with just the elements we want to process:

# get the elements of fruits starting at index 2
fruits[2:]
['grapefruit', 'strawberry']

and plug that expression directly into our loop:

for fruit in fruits[2:]:
    print(fruit)
grapefruit
strawberry
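Because a slice is a copy, changing it leaves the original list alone. The sketch below illustrates that, and also shows itertools.islice from the standard library as one possible alternative when the list is large enough that copying matters. The extra value 'kiwi' and the variable names are just for illustration.

```python
from itertools import islice

fruits = ['apple', 'banana', 'grapefruit', 'strawberry']

# the slice is a new list, so appending to it does not touch fruits
tail = fruits[2:]
tail.append('kiwi')

print(fruits)  # still the original four elements
print(tail)    # the two sliced elements plus 'kiwi'

# islice skips the first two elements lazily, without building a copy;
# list() is used here only so that the result can be printed
lazy_tail = list(islice(fruits, 2, None))
print(lazy_tail)
```

For a short list like this one the plain slice is the more readable choice; islice is worth considering mainly for very long sequences or for iterables that cannot be sliced.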

How about if we want just the elements at even-numbered indices: in our case 0 and 2, but not 1 or 3? The manual version will work if we increment by two rather than one:

index = 0
while index < len(fruits):
    print(fruits[index])
    index = index + 2
apple
grapefruit

or we could use range again with the three-argument variation:

# go up in steps of 2
for index in range(0, len(fruits), 2):
    print(fruits[index])
apple
grapefruit

but again, the most Pythonic way is to slice the list:

# from the start to the end in steps of 2
fruits[::2]
['apple', 'grapefruit']

then plug it into the loop:

for fruit in fruits[::2]:
    print(fruit)
apple
grapefruit
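The slices used so far are all instances of the general start:stop:step form, where any part that is left out falls back to a default. The short sketch below lines up a few slices against the equivalent range-based versions; the variable names are made up for illustration.

```python
fruits = ['apple', 'banana', 'grapefruit', 'strawberry']

# elements at even indices: start and stop omitted, step of 2
even = fruits[::2]

# elements at odd indices: start at 1, step of 2
odd = fruits[1::2]

# elements at indices 1 and 2: stop is exclusive, just like range
middle = fruits[1:3]

# each slice matches the equivalent range-based indexing
assert even == [fruits[i] for i in range(0, len(fruits), 2)]
assert odd == [fruits[i] for i in range(1, len(fruits), 2)]
assert middle == [fruits[i] for i in range(1, 3)]

print(even)
print(odd)
print(middle)
```

The slice arguments mirror the arguments to range, so any loop written with range(start, stop, step) over the indices of a list can usually be rewritten as a loop over the matching slice.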

One more time; if you want to see the rest of these little write-ups, sign up for the mailing list:

Sign up for the mailing list